Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter regex patterns in addition to prefixes #59

Merged
merged 7 commits into from
Sep 28, 2017

Conversation

dhood
Copy link
Member

@dhood dhood commented Aug 18, 2017

this will allow us to filter out lines based on regexes instead of just prefixes. note that even if you specify a multiline regex it will only ever be matched against one line of console output at a time (this was also true how we were matching the prefixes before)

this was motivated by wanting to filter things like the fastrtps error message in
http://ci.ros2.org/view/nightly/job/nightly_osx_repeated/772/testReport/(root)/projectroot/test_tutorial_add_two_ints_server_add_two_ints_client_async__rmw_fastrtps_cpp/
where prefix matching wouldn't be sufficient. Even though we probably don't want to filter that error message in general, I found this useful while investigating flaky tests. Using this I could run tests N times ignoring this error, preventing unrelated test failures from its output.

CI including relevant changes in dependencies (for which I'll open PRs if this change looks OK conceptually)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@dhood dhood added the in review Waiting for review (Kanban column) label Aug 18, 2017
@dhood dhood self-assigned this Aug 18, 2017
@dirk-thomas
Copy link
Member

dirk-thomas commented Aug 18, 2017

This breaks any code currently using this functionality. Maybe the existing API can be kept and the pattern filter introduced beside it?

Updating the rmw implementations to always provide patterns seems fine since that won't affect users (except those implementing their own rmw impl. with custom filters).

@dhood
Copy link
Member Author

dhood commented Aug 18, 2017

If every prefix can be expressed as a regex, I think it's better for us to drop support for the prefix option so that we don't have to maintain it and just update our usage to the new API (no one else on github uses this function)

If, however, we really like the simplicity of the prefix option, we could keep support for it, but that's a different reason

@dhood
Copy link
Member Author

dhood commented Aug 23, 2017

@dirk-thomas are you in favour of keeping support for the prefix filtering, or was it just an idea you were floating? If so, I can change it, it's easy to do, I'd just like to understand the reasoning (since I think you're usually in favour of maintaining as few options as possible)

@dirk-thomas
Copy link
Member

are you in favour of keeping support for the prefix filtering, or was it just an idea you were floating?

Yes, I think we should keep supporting prefix filtering. While every prefix can be expressed as a regex it goes imo against the design principle that a user shouldn't "pay" for a feature he isn't using.

since I think you're usually in favour of maintaining as few options as possible

If multiple options achieve the same think I am certainly in favor reducing them to the minimal amount. In this case I see the two options as functionality different though: the prefix filtering is the "best" approach if that is sufficient for the use case, the regex allows more complicated filters but come with the "price" of additional overhead.

@dhood
Copy link
Member Author

dhood commented Aug 23, 2017

ok, cool. originally it seemed because of api breakage. this makes sense to me thanks, will update

@dhood dhood added in progress Actively being worked on (Kanban column) and removed in review Waiting for review (Kanban column) labels Aug 23, 2017
@dhood dhood force-pushed the filtered_patterns_not_prefixes branch from b617866 to eb33c9f Compare September 6, 2017 16:40
@dhood
Copy link
Member Author

dhood commented Sep 6, 2017

This PR has been updated to add filtered_patterns as an option (and update the RMW-specific filters to be regexes instead of prefixes)

CI with the attached PR updating the connext license output ros2/rmw_connext#242:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status (unrelated flaky test)

@dhood dhood added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Sep 6, 2017

if filtered_rmw_implementation:
rmw_output_filter = get_rmw_output_filter(filtered_rmw_implementation)
self.filtered_prefixes.extend(rmw_output_filter)
self.filtered_patterns.extend(rmw_output_filter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should the rmw filters need to be patterns? If a prefix filter is sufficient it would be good if an rmw impl. could use that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood from your comment above that, while we want users to give users the choice between patterns or prefixes, it's not necessary for "internal" rmw filters:

Updating the rmw implementations to always provide patterns seems fine since that won't affect users (except those implementing their own rmw impl. with custom filters).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was 20 days ago - a lot has happened since then 😉

Sure, we could make all rmw impl. use regex since we kind of maintain most of them. That was the main rational in the first review. But my latest comment was about efficiency - why use a regex if a prefix does the job just fine? I haven't spend any thought on how this could be done. An rmw impl. would need to somehow declare the type of their filters - either by using a different resource key or embedding the type information in the resource. The first seems to be easier.

@@ -70,6 +77,9 @@ def on_stdout_lines(self, lines):

for line in lines.splitlines():
# Filter out stdout that comes from underlying DDS implementation
# Note: we do not currently support matching filters across multiple stdout lines.
if any(re.fullmatch(pattern, line) for pattern in self.filtered_patterns):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a fullmatch? Could it just use match instead and leave it up to the regex to use ^ and $?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was just trying to avoid regexes from matching accidentally, but I agree that that should be the choice of the user.

If we just use match, which only matches at the start of a line, then actually the prefix option could then be collapsed back into the pattern option... in which case we can just document that filters can be literal strings of prefixes for those users just wanting the simplicity of specifying prefixes. How's that sound?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you can determine if the filter is supposed to be a string or a regular expression. E.g. if it contains a . you can't guess the semantic. Therefore I think you need to know the type of the filter (prefix string vs. regex) in order to invoke the corresponding logic.

Again, I think we should never use a regular expression match when a string match is sufficient simply because of the unjustified overhead.

@tfoote tfoote added in progress Actively being worked on (Kanban column) and removed in review Waiting for review (Kanban column) labels Sep 8, 2017
@dhood dhood changed the title Filter regex patterns instead of prefixes Filter regex patterns in addition to prefixes Sep 27, 2017
@dhood
Copy link
Member Author

dhood commented Sep 28, 2017

OK this PR now extends both the API and the rmw-specified filters to accept both prefixes and regexes. ros2/rmw_connext#242 has been updated to clarify it's registering a prefix to filter but otherwise no change to the codebase.

CI with this PR and ros2/rmw_connext#242:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status (Warnings from fastrtps that have been reported)

@dhood dhood added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Sep 28, 2017
if any(line.startswith(prefix) for prefix in self.filtered_prefixes):
continue
if any(re.match(pattern, line) for pattern in self.filtered_patterns):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was first worried that using the same pattern without compiling it repeatedly would be slower. But it turns out modern Python already do a great job compiling the pattern and caching it for future use. So all good here 😃

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I should have mentioned that I checked the same thing, to avoid you looking into it. it might be worthwhile if there are a lot of unique regexes, but for now we don't have that case

@dhood dhood merged commit 3c91f3a into master Sep 28, 2017
@dhood dhood deleted the filtered_patterns_not_prefixes branch September 28, 2017 22:26
@dhood dhood removed the in review Waiting for review (Kanban column) label Sep 28, 2017
@dhood
Copy link
Member Author

dhood commented Sep 28, 2017

thanks for the nitpick commits

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants